Decipher the Modality Gap in Multimodal Contrastive Learning: From Convergent Representations to Pairwise Alignment
Lingjie Yi, Raphael Douady, Chao Chen
Multimodal contrastive learning (MCL) aims to embed data from different modalities in a shared embedding space. However, empirical evidence shows that representations from different modalities occupy completely separate regions of the embedding space, a phenomenon referred to as the modality gap. Moreover, experimental findings on how the size of the modality gap affects downstream performance are inconsistent. These observations raise two key questions: (1) What causes the modality gap? (2) How does it affect downstream tasks? To address these questions, this paper introduces the first theoretical framework for analyzing the convergent optimal representations of MCL and the modality alignment achieved when training is fully optimized. Specifically, we prove that without any constraint, or under the cone constraint, the modality gap converges to zero. Under the subspace constraint (i.e., the representations of the two modalities fall into two distinct hyperplanes due to dimension collapse), the modality gap converges to the smallest angle between the two hyperplanes. This result identifies \emph{dimension collapse} as the fundamental origin of the modality gap. Furthermore, our theorems demonstrate that paired samples cannot be perfectly aligned under the subspace constraint; the modality gap thus influences downstream performance by affecting the alignment between sample pairs. We prove that, in this case, perfect alignment between the two modalities can still be achieved in two ways: hyperplane rotation and shared-space projection.
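The two quantities the abstract reasons about can be made concrete numerically. Below is a minimal sketch (ours, not the authors' code) that measures (1) the centroid distance between modalities, a common operationalization of the modality gap, and (2) the smallest principal angle between the two modality subspaces, the quantity the subspace-constraint result says the gap converges to. The synthetic embeddings and the choice of top-k principal directions are assumptions made only for illustration.

```python
# Minimal sketch (not from the paper): two ways to quantify the modality
# gap for paired image/text embeddings.
import numpy as np
from scipy.linalg import subspace_angles

rng = np.random.default_rng(0)

# Hypothetical stand-ins for CLIP-style embeddings: n paired samples in R^d.
n, d = 512, 64
img = rng.normal(size=(n, d))
txt = rng.normal(size=(n, d)) + 2.0  # shifted to simulate a gap

# L2-normalize, as contrastive models typically embed on the unit sphere.
img /= np.linalg.norm(img, axis=1, keepdims=True)
txt /= np.linalg.norm(txt, axis=1, keepdims=True)

# (1) Centroid gap: distance between the two modality means.
gap_vector = img.mean(axis=0) - txt.mean(axis=0)
print("centroid gap:", np.linalg.norm(gap_vector))

# (2) Smallest principal angle between the two modality subspaces,
#     spanning each modality by its top-k principal directions via SVD.
k = 8
U_img = np.linalg.svd(img - img.mean(0), full_matrices=False)[2][:k].T
U_txt = np.linalg.svd(txt - txt.mean(0), full_matrices=False)[2][:k].T
angles = subspace_angles(U_img, U_txt)  # returned in descending order
print("smallest principal angle (rad):", angles.min())
```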
A Details of the empirical setup in Section 3.4
Our model is one of the simplest possible for studying specialization in a supply-side marketplace. First, the infinite, high-dimensional content embedding space captures the fact that digital goods cannot be cleanly clustered into categories but are instead often mixtures of several dimensions (e.g., a movie can be both a drama and a comedy); see Anderson et al. [1992] for a textbook treatment. The assumption that all producers share the same cost function is also a simplification, but, perhaps surprisingly, it still allows us to study specialization.

Proposition 4. For any set of users and any $\beta \geq 1$, a pure strategy equilibrium does not exist.
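To make a nonexistence claim like Proposition 4 tangible, here is a toy sketch, not the paper's model or proof: a discretized two-producer game with a shared production cost, exhaustively checked for pure strategy equilibria. The two-user market, the 3x3 candidate grid, and the cost form $\|p\|^{\beta}$ with $\beta = 1$ are all hypothetical choices; in this instance, no strategy profile is a mutual best response.

```python
# Toy sketch (not the paper's construction): a tiny discretized producer
# game, exhaustively checked for pure strategy equilibria. The user taste
# vectors, the candidate grid, and beta are hypothetical.
import itertools
import numpy as np

users = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]  # user taste vectors
grid = [np.array(c) for c in itertools.product([0.0, 0.6, 1.2], repeat=2)]
beta = 1.0                                            # cost = ||p||^beta

def utility(p, q):
    """Payoff of a producer playing p against an opponent playing q:
    users won (ties split evenly) minus the production cost."""
    won = sum(1.0 if u @ p > u @ q else 0.5 if u @ p == u @ q else 0.0
              for u in users)
    return won - np.linalg.norm(p) ** beta

# A profile (p, q) is a pure strategy equilibrium iff each strategy is a
# best response to the other.
equilibria = [
    (tuple(p), tuple(q))
    for p in grid for q in grid
    if utility(p, q) >= max(utility(r, q) for r in grid)
    and utility(q, p) >= max(utility(r, p) for r in grid)
]
print("pure strategy equilibria found:", equilibria)  # prints: []
```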
E-Gen: Leveraging E-Graphs to Improve Continuous Representations of Symbolic Expressions
Hongbo Zheng, Suyuan Wang, Neeraj Gangwar, Nickvash Kani
As vector representations have been pivotal in advancing natural language processing (NLP), prior research has explored embedding techniques for mathematical expressions that leverage mathematically equivalent expressions. While effective, these methods are limited by their training data. In this work, we propose augmenting prior algorithms with a larger synthetic dataset produced by a novel e-graph-based generation scheme. This scheme, E-Gen, improves upon prior dataset-generation schemes that are limited in size and in the operator types they cover. We use this dataset to compare embedding models trained with two methods: (1) training the model to generate mathematically equivalent expressions, and (2) training the model with contrastive learning to explicitly group mathematically equivalent expressions. We evaluate the resulting embeddings against prior work on both in-distribution and out-of-distribution language processing tasks. Finally, we compare our embedding scheme against state-of-the-art large language models and show that embedding-based methods outperform LLMs on several tasks, demonstrating the necessity of optimizing embedding methods for the mathematical data modality.
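As a concrete illustration of method (2), here is a minimal sketch of a contrastive objective that pulls embeddings of equivalent expressions together. The bag-of-characters encoder, the three example pairs, and the temperature are hypothetical stand-ins, not E-Gen's actual pipeline; a real system would use a sequence encoder trained on e-graph-generated equivalences.

```python
# Minimal sketch (not E-Gen's training code): InfoNCE-style contrastive
# loss over pairs of mathematically equivalent expressions.
import torch
import torch.nn.functional as F

# Hypothetical equivalent-expression pairs (anchor, positive).
pairs = [("x*(y+z)", "x*y+x*z"),
         ("sin(t)**2+cos(t)**2", "1"),
         ("(a+b)**2", "a**2+2*a*b+b**2")]

vocab = sorted(set("".join(s for p in pairs for s in p)))
stoi = {ch: i for i, ch in enumerate(vocab)}

def encode(expr):
    """Toy bag-of-characters featurizer standing in for a real encoder."""
    v = torch.zeros(len(vocab))
    for ch in expr:
        v[stoi[ch]] += 1.0
    return v

emb = torch.nn.Linear(len(vocab), 16)        # trainable projection
opt = torch.optim.Adam(emb.parameters(), lr=1e-2)

for step in range(100):
    a = F.normalize(emb(torch.stack([encode(x) for x, _ in pairs])), dim=1)
    b = F.normalize(emb(torch.stack([encode(y) for _, y in pairs])), dim=1)
    logits = a @ b.T / 0.1                   # temperature tau = 0.1
    labels = torch.arange(len(pairs))        # positives on the diagonal
    loss = F.cross_entropy(logits, labels)   # InfoNCE over the batch
    opt.zero_grad(); loss.backward(); opt.step()

print("final contrastive loss:", loss.item())
```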